Towards Replicability in Parsing
نویسندگان
چکیده
We investigate parsing replicability across 7 languages (and 8 treebanks), showing that choices concerning the use of grammatical functions in parsing or evaluation and the influence of the rare word threshold, as well as choices in test sentences and evaluation script options have considerable and often unexpected effects on parsing accuracies. All of those choices need to be carefully documented if we want to ensure replicability.
منابع مشابه
بررسی مقایسهای تأثیر برچسبزنی مقولات دستوری بر تجزیه در پردازش خودکار زبان فارسی
In this paper, the role of Part-of-Speech (POS) tagging for parsing in automatic processing of the Persian language is studied. To this end, the impact of the quality of POS tagging as well as the impact of the quantity of information available in the POS tags on parsing are studied. To reach the goals, three parsing scenarios are proposed and compared. In the first scenario, the parser assigns...
متن کاملAn improved joint model: POS tagging and dependency parsing
Dependency parsing is a way of syntactic parsing and a natural language that automatically analyzes the dependency structure of sentences, and the input for each sentence creates a dependency graph. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers do the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...
متن کاملReplicability Analysis for Natural Language Processing: Testing Significance with Multiple Datasets
With the ever growing amount of textual data from a large variety of languages, domains, and genres, it has become standard to evaluate NLP algorithms on multiple datasets in order to ensure a consistent performance across heterogeneous setups. However, such multiple comparisons pose significant challenges to traditional statistical analysis methods in NLP and can lead to erroneous conclusions....
متن کاملWord vectors, reuse, and replicability: Towards a community repository of large-text resources
This paper describes an emerging shared repository of large-text resources for creating word vectors, including pre-processed corpora and pre-trained vectors for a range of frameworks and configurations. This will facilitate reuse, rapid experimentation, and replicability of results.
متن کاملThe validity, diagnostic value and replicability of Bender Visual-Motor Gestalt Test in traumatic brain injury patients
Introduction: Bender Gestalt test is one of the most famous neuropsychological tests that is simple and it can be used to examine brain injuries. The objective of this research was to investigate the validity, diagnostic strength and the replicability of the Bender Visual-Motor Gestalt Test in patients with traumatic brain injury (TBI). Methods: 240 participants were tested in a case-control st...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2017